AITopics | empty space

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

On the Limits of Innate Planning in Large Language Models

Schepanowski, Charles, Ling, Charles

arXiv.org Artificial Intelligence, Nov-27-2025

Large language models (LLMs) achieve impressive results on many benchmarks, yet their capacity for planning and stateful reasoning remains unclear. We study these abilities directly, without code execution or other tools, using the 8-puzzle: a classic task that requires state tracking and goal-directed planning while allowing precise, step-by-step evaluation. Four models are tested under common prompting conditions (Zero-Shot, Chain-of-Thought, Algorithm-of-Thought) and with tiered corrective feedback. Feedback improves success rates for some model-prompt combinations, but many successful runs are long, computationally expensive, and indirect. We then examine the models with an external move validator that provides only valid moves. Despite this level of assistance, none of the models solve any puzzles in this setting. Qualitative analysis reveals two dominant deficits across all models: (1) brittle internal state representations, leading to frequent invalid moves, and (2) weak heuristic planning, with models entering loops or selecting actions that do not reduce the distance to the goal state. These findings indicate that, in the absence of external tools such as code interpreters, current LLMs have substantial limitations in planning and that further progress may require mechanisms for maintaining explicit state and performing structured search.
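The two ingredients the abstract names, an external move validator and a distance-to-goal heuristic, can be illustrated with a minimal sketch. This is not the authors' code: the row-major tuple encoding, the goal layout, the move names (which describe how the blank moves), and the choice of Manhattan distance as the heuristic are all assumptions made here for illustration.

```python
from typing import List, Tuple

# Assumed encoding: 9 entries in row-major order, 0 marks the blank.
State = Tuple[int, ...]

# Assumed goal layout (the paper may use a different one).
GOAL: State = (1, 2, 3, 4, 5, 6, 7, 8, 0)

# Each move shifts the blank by (row delta, column delta).
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def valid_moves(state: State) -> List[str]:
    """Return the moves that keep the blank on the 3x3 board."""
    row, col = divmod(state.index(0), 3)
    return [
        name
        for name, (dr, dc) in MOVES.items()
        if 0 <= row + dr < 3 and 0 <= col + dc < 3
    ]


def apply_move(state: State, move: str) -> State:
    """Slide the blank in the given direction, rejecting invalid moves."""
    if move not in valid_moves(state):
        raise ValueError(f"invalid move {move!r} for state {state}")
    blank = state.index(0)
    dr, dc = MOVES[move]
    target = blank + 3 * dr + dc
    tiles = list(state)
    tiles[blank], tiles[target] = tiles[target], tiles[blank]
    return tuple(tiles)


def manhattan(state: State, goal: State = GOAL) -> int:
    """Sum of each tile's row and column offset from its goal position."""
    total = 0
    for idx, tile in enumerate(state):
        if tile == 0:
            continue  # the blank does not count toward the distance
        row, col = divmod(idx, 3)
        goal_row, goal_col = divmod(goal.index(tile), 3)
        total += abs(row - goal_row) + abs(col - goal_col)
    return total
```

A move that "does not reduce the distance to the goal state" is, under these assumptions, one where `manhattan(apply_move(state, move))` is not smaller than `manhattan(state)`.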

large language model, machine learning, puzzle, (17 more...)

arXiv.org Artificial Intelligence

2511.21591

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Efficient On-Policy Reinforcement Learning via Exploration of Sparse Parameter Space

Zhang, Xinyu, Deb, Aishik, Mueller, Klaus

arXiv.org Artificial Intelligence, Oct-1-2025

Policy-gradient methods such as Proximal Policy Optimization (PPO) are typically updated along a single stochastic gradient direction, leaving the rich local structure of the parameter space unexplored. Previous work has shown that the surrogate gradient is often poorly correlated with the true reward landscape. Building on this insight, we visualize the parameter space spanned by policy checkpoints within an iteration and reveal that higher performing solutions often lie in nearby unexplored regions. To exploit this opportunity, we introduce ExploRLer, a pluggable pipeline that seamlessly integrates with on-policy algorithms such as PPO and TRPO, systematically probing the unexplored neighborhoods of surrogate on-policy gradient updates. Without increasing the number of gradient updates, ExploRLer achieves significant improvements over baselines in complex continuous control environments. Our results demonstrate that iteration-level exploration provides a practical and effective way to strengthen on-policy reinforcement learning and offer a fresh perspective on the limitations of the surrogate objective.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

2509.25876

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.75)

Ludax: A GPU-Accelerated Domain Specific Language for Board Games

Todd, Graham, Padula, Alexander G., Soemers, Dennis J. N. J., Togelius, Julian

arXiv.org Artificial Intelligence, Jul-1-2025

Games have long been used as benchmarks and testing environments for research in artificial intelligence. A key step in supporting this research was the development of game description languages: frameworks that compile domain-specific code into playable and simulatable game environments, allowing researchers to generalize their algorithms and approaches across multiple games without having to manually implement each one. More recently, progress in reinforcement learning (RL) has been largely driven by advances in hardware acceleration. Libraries like JAX allow practitioners to take full advantage of cutting-edge computing hardware, often speeding up training and testing by orders of magnitude. Here, we present a synthesis of these strands of research: a domain-specific language for board games which automatically compiles into hardware-accelerated code. Our framework, Ludax, combines the generality of game description languages with the speed of modern parallel processing hardware and is designed to fit neatly into existing deep learning pipelines. We envision Ludax as a tool to help accelerate games research generally, from RL to cognitive science, by enabling rapid simulation and providing a flexible representation scheme. We present a detailed breakdown of Ludax's description language and technical notes on the compilation process, along with speed benchmarking and a demonstration of training RL agents. The Ludax framework, along with implementations of existing board games, is open-source and freely available.

machine learning, natural language, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

2506.22609

Country:

Europe > Switzerland (0.28)
Europe > Austria (0.28)

Genre: Research Report (0.40)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Generalizing End-To-End Autonomous Driving In Real-World Environments Using Zero-Shot LLMs

Dong, Zeyu, Zhu, Yimin, Li, Yansong, Mahon, Kevin, Sun, Yu

arXiv.org Artificial Intelligence, Nov-21-2024

Traditional autonomous driving methods adopt a modular design, decomposing tasks into sub-tasks. In contrast, end-to-end autonomous driving directly outputs actions from raw sensor data, avoiding error accumulation. However, training an end-to-end model requires a comprehensive dataset; otherwise, the model exhibits poor generalization capabilities. Recently, large language models (LLMs) have been applied to enhance the generalization capabilities of end-to-end driving models. Most studies explore LLMs in an open-loop manner, where the output actions are compared to those of experts without direct feedback from the real world, while others examine closed-loop results only in simulations. This paper proposes an efficient architecture that integrates multimodal LLMs into end-to-end driving models operating in closed-loop settings in real-world environments. In our architecture, the LLM periodically processes raw sensor data to generate high-level driving instructions, effectively guiding the end-to-end model, even at a slower rate than the raw sensor data. This architecture relaxes the trade-off between the latency and inference quality of the LLM. It also allows us to choose from a wide variety of LLMs to improve high-level driving instructions and minimize fine-tuning costs. Consequently, our architecture reduces data collection requirements because the LLMs do not directly output actions; we only need to train a simple imitation learning model to output actions. In our experiments, the training data for the end-to-end model in a real-world environment consists of only simple obstacle configurations with one traffic cone, while the test environment is more complex and contains multiple obstacles placed in various positions. Experiments show that the proposed architecture enhances the generalization capabilities of the end-to-end model even without fine-tuning the LLM.
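The two-rate idea in the abstract, a slow LLM that refreshes a high-level instruction while a fast model acts on every sensor frame, might be sketched as below. This is a simplified, synchronous illustration rather than the paper's architecture: `run_closed_loop`, `llm_instruct`, `policy`, and `llm_period` are hypothetical names, and a real system would most likely run the LLM asynchronously.

```python
from typing import Any, Callable, List, Optional, Sequence


def run_closed_loop(
    frames: Sequence[Any],
    llm_instruct: Callable[[Any], str],
    policy: Callable[[Any, Optional[str]], Any],
    llm_period: int,
) -> List[Any]:
    """Run the fast policy on every frame; query the slow LLM every llm_period frames.

    Between LLM queries the policy keeps using the most recent instruction.
    """
    if llm_period < 1:
        raise ValueError("llm_period must be at least 1")
    instruction: Optional[str] = None
    actions: List[Any] = []
    for step, frame in enumerate(frames):
        if step % llm_period == 0:
            # Slow path: refresh the high-level instruction from this frame.
            instruction = llm_instruct(frame)
        # Fast path: the imitation-learned policy acts on every frame.
        actions.append(policy(frame, instruction))
    return actions
```

Because the LLM only emits instructions and never actions, swapping in a different LLM under this sketch means changing `llm_instruct` alone; the trained `policy` is untouched.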

empty space, end-to-end model, obstacle, (16 more...)

arXiv.org Artificial Intelligence

2411.14256

Country:

North America > United States > Washington > King County > Seattle (0.04)
North America > United States > New York > Suffolk County > Stony Brook (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
(2 more...)

Genre: Research Report > New Finding (0.88)

Industry:

Transportation > Ground > Road (1.00)
Information Technology > Robotics & Automation (1.00)
Automobiles & Trucks (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)

ING-VP: MLLMs cannot Play Easy Vision-based Games Yet

Zhang, Haoran, Guo, Hangyu, Guo, Shuyue, Cao, Meng, Huang, Wenhao, Liu, Jiaheng, Zhang, Ge

arXiv.org Artificial Intelligence, Oct-9-2024

As multimodal large language models (MLLMs) continue to demonstrate increasingly competitive performance across a broad spectrum of tasks, more intricate and comprehensive benchmarks have been developed to assess these cutting-edge models. These benchmarks introduce new challenges to core capabilities such as perception, reasoning, and planning. However, existing multimodal benchmarks fall short in providing a focused evaluation of multi-step planning based on spatial relationships in images. To bridge this gap, we present ING-VP, the first INteractive Game-based Vision Planning benchmark, specifically designed to evaluate the spatial imagination and multi-step reasoning abilities of MLLMs. ING-VP features 6 distinct games, encompassing 300 levels, each with 6 unique configurations. A single model engages in over 60,000 rounds of interaction. The benchmark framework allows for multiple comparison settings, including image-text vs. text-only inputs, single-step vs. multi-step reasoning, and with-history vs. without-history conditions, offering valuable insights into the model's capabilities. We evaluated numerous state-of-the-art MLLMs, with the highest-performing model, Claude-3.5 Sonnet, achieving an average accuracy of only 3.37%, far below the anticipated standard. This work aims to provide a specialized evaluation framework to drive advancements in MLLMs' capacity for complex spatial reasoning and planning. The code is publicly available at https://github.com/Thisisus7/ING-VP.git.

comp, history eff, instruction, (13 more...)

arXiv.org Artificial Intelligence

2410.06555

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Asia > Vietnam > Hanoi > Hanoi (0.06)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report > New Finding (0.67)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.95)

A Quality Diversity Approach to Automatically Generate Multi-Agent Path Finding Benchmark Maps

Qian, Cheng, Zhang, Yulun, Bhatt, Varun, Fontaine, Matthew Christopher, Nikolaidis, Stefanos, Li, Jiaoyang

arXiv.org Artificial Intelligence, Sep-10-2024

We use the Quality Diversity (QD) algorithm with Neural Cellular Automata (NCA) to generate benchmark maps for Multi-Agent Path Finding (MAPF) algorithms. Previously, MAPF algorithms have been tested using fixed, human-designed benchmark maps. However, such fixed benchmark maps have several problems. First, these maps may not cover all the potential failure scenarios for the algorithms. Second, when comparing different algorithms, fixed benchmark maps may introduce bias leading to unfair comparisons between algorithms. In this work, we take advantage of the QD algorithm and NCA with different objectives and diversity measures to generate maps with patterns that help us comprehensively understand the performance of MAPF algorithms and make fair comparisons between two MAPF algorithms, informing the choice between them. Empirically, we employ this technique to generate diverse benchmark maps to evaluate and compare the behavior of different types of MAPF algorithms such as bounded-suboptimal algorithms, suboptimal algorithms, and reinforcement-learning-based algorithms. Through both single-planner experiments and comparisons between algorithms, we identify patterns where each algorithm excels and detect disparities in runtime or success rates between different algorithms.

algorithm, kl divergence, proceedings, (14 more...)

arXiv.org Artificial Intelligence

2409.06888

Country:

North America > United States > California (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Transportation (0.68)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Can Large Language Models Create New Knowledge for Spatial Reasoning Tasks?

Greatrix, Thomas, Whitaker, Roger, Turner, Liam, Colombo, Walter

arXiv.org Artificial Intelligence, May-23-2024

The potential for Large Language Models (LLMs) to generate new information offers a potential step change for research and innovation. This is challenging to assert as it can be difficult to determine what an LLM has previously seen during training, making "newness" difficult to substantiate. In this paper we observe that LLMs are able to perform sophisticated reasoning on problems with a spatial dimension that they are unlikely to have previously encountered directly. While not perfect, this points to a significant level of understanding that state-of-the-art LLMs can now achieve, supporting the proposition that LLMs are able to yield significant emergent properties. In particular, Claude 3 is found to perform well in this regard.

bing copilot, claude 3, polygon, (12 more...)

arXiv.org Artificial Intelligence

2405.14379

Country:

Asia > Middle East > Jordan (0.04)
North America > Canada > Manitoba > Westman Region > Brandon (0.04)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment > Games (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.54)

Landslide Topology Uncovers Failure Movements

Rana, Kamal, Bhuyan, Kushanav, Ferrer, Joaquin Vicente, Cotton, Fabrice, Ozturk, Ugur, Catani, Filippo, Malik, Nishant

arXiv.org Artificial Intelligence, Oct-14-2023

Every year, landslides cause economic damages worth 20 billion US dollars [1], and between 2004 and 2019 non-seismic landslides alone caused about 70,000 fatalities worldwide [2]. Within the first two months of 2023, we have seen reports of devastating landslides in São Paulo, Brazil [3], Southern Peru [4], and New Zealand [5], injuring many and killing approximately 70 people. Adding to this, recent studies count over one million landslide occurrences with annual volumes estimated at fifty-six billion cubic meters globally [6], presenting a risk to sixty million people [7]. With the increase in urbanization, global climate change, and environmental change trends, the frequency of landslides and the associated risks will keep increasing globally over time [7]. In line with this, landslides are anticipated to evolve and remobilize with increased frequency under changing climatic conditions on a decadal scale [8, 9]. Our ability to identify hazards from emerging landslides and dynamically assess impact areas is essential in averting risk to rapidly urbanizing communities and adapting to changing environmental conditions [10, 7]. To address the rising landslide risk, predictive models for hazard, risk, and early warning systems are developed which assist in forecasting landslide occurrences and locating landslide-prone regions to mitigate the associated impacts [11]. However, the efficacy of these models is contingent on the quality of the underlying landslide databases.

failure type, landslide, topological property, (15 more...)

arXiv.org Artificial Intelligence

2310.09631

Country:

South America > Peru (0.24)
Oceania > New Zealand (0.24)
South America > Brazil > São Paulo (0.24)
(10 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Energy (0.93)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Software > Programming Languages (0.92)
Information Technology > Data Science (0.89)

Framework for 2D Ad placements in LinearTV

Bhargavi, Divya, Sindwani, Karan, Gholami, Sia

arXiv.org Artificial Intelligence, Dec-5-2022

Virtual product placement (VPP) is the advertising technique of digitally placing a branded object into the scene of a movie or TV show. This type of advertising provides the ability for brands to reach consumers without interrupting the viewing experience with a commercial break, as the products are seen in the background or as props. Despite this being a billion-dollar industry, ad rendering is currently performed manually at the post-production stage, either with the help of VFX artists or through semi-automated solutions. In this paper, we demonstrate a fully automated framework to digitally place 2-D ads in linear TV cooking shows captured using a single-view camera with small camera movements. Without access to the full video or the production camera configuration, this framework performs the following tasks: (i) identifying empty space for 2-D ad placement, (ii) kitchen scene understanding, (iii) occlusion handling, (iv) ambient lighting, and (v) ad tracking.

artificial intelligence, machine learning, proceedings, (13 more...)

arXiv.org Artificial Intelligence

2212.0245

Country: North America > United States (0.04)

Genre: Research Report (0.64)

Industry:

Media > Television (0.55)
Media > Film (0.55)
Media > Photography (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Data Science (0.93)
Information Technology > Sensing and Signal Processing > Image Processing (0.93)

Emerging cooperation on the road by myopic local interactions

Rabinovich, Dmitry, Bruckstein, Alfred M.

arXiv.org Artificial Intelligence, Sep-3-2022

In recent years, research in the field of autonomous vehicles has gained considerable momentum, and the idea of relieving humans of the burden of driving is starting to lose its futuristic science-fiction aura. Some people believe that autonomous traffic is "our last hope" of relief from the frequent road jams we now witness even in mid-size urban areas. We envision roads of the future with fully autonomous vehicles that not only track the lane, keep a safe distance, and assist the driver, but essentially liberate humans from driving-related activities altogether.

agent, empty space, vehicle, (14 more...)

arXiv.org Artificial Intelligence

2208.0376

Country:

Asia > Middle East > Israel (0.04)
North America > United States > California (0.04)
Europe > United Kingdom > England (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (1.00)

Industry: Transportation > Ground > Road (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.92)
